Map generations in GIS and Rstudio follow the diagram below. They use the same point data in csv format and borough map in shapefile, but transform and manipulate data in various processes.
In GIS, a point feature class is created by combining longitude and latitude from csv parameters with XY fields, and exporting XY data into new feature class. Project in data management tools could convert coordinate system WGS1984 into British national grid. Then points inside borough area are selected by “select by location” button, while robbery data are chosen by equation in “select by attributes”.
To analyse the density of robbery points selected by attribute, we use kernel density rather than point density to accumulate total number of individual points in the cell (Pro.arcgis.com, 2018). The output is in the format of raster which is controlled in the cell size of 25 to save the response time but bring accuracy deviation as well. In the layer properties, appropriate range of value could be represented through “classify” button and colour ramp. In addition, we could choose the base map in “add basemap” with multiple styles.
Before exporting the map, the final step is layout by inserting legend, scale bar, title etc., and all these elements could be dragged flexibly in the layout window. Besides, it’s useful to locator the map by using extent indicators to show data frame in England scale.
While geocoding with R, point data are loaded by read_csv function from the readr package. Before manipulating the data, tidyverse library could clean interfering text characters rather than manually cleaning in GIS. Sp and rgdal is used to transform coordinate reference system (CRS) in EPSG code. Unlike location dependence in GIS, point outside boundaries selected by coordinate coding in R.
## read new csv file
crimedata3 <- read.csv("E:/UCL/005-GI System&Science/Assessment 1/2018-03-metropolitan-street-clean.csv",stringsAsFactors = F)
## clean the data out of borough
crimedata3 <- crimedata3[which(crimedata3$Latitude > 51.275), ]
crimedata3 <- crimedata3[which(crimedata3$Latitude < 51.7), ]
crimedata3 <- crimedata3[which(crimedata3$Longitude > -0.53), ]
crimedata3 <- crimedata3[which(crimedata3$Longitude < 0.35), ]
summary(crimedata3)
X Longitude2 Latitude2 Longitude Latitude Location
Min. : 2 Min. :-0.49593 Min. :51.29 Min. :-0.49593 Min. :51.29 Length:62958
1st Qu.:15757 1st Qu.:-0.23271 1st Qu.:51.47 1st Qu.:-0.23271 1st Qu.:51.47 Class :character
Median :31504 Median :-0.11567 Median :51.53 Median :-0.11567 Median :51.53 Mode :character
Mean :31502 Mean :-0.12364 Mean :51.51 Mean :-0.12364 Mean :51.51
3rd Qu.:47249 3rd Qu.:-0.01275 3rd Qu.:51.56 3rd Qu.:-0.01275 3rd Qu.:51.56
Max. :62999 Max. : 0.34284 Max. :51.69 Max. : 0.34284 Max. :51.69
LSOA.code LSOA.name Crime.type
Length:62958 Length:62958 Length:62958
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
## transform coordinate system into WGS1984(epsg4326), and transform sp-sf
coordinates(crimedata3) <- c("Longitude","Latitude")
proj4string(crimedata3) <- CRS("+init=epsg:4326")
crimedata3SF <- st_as_sf(crimedata3)
crimedata3SF26 <- st_transform(crimedata3SF, 4326)
The ggplot2 package is extremely useful for a quick geospatial visualization. As geom_sf() function could draw more objects than geom_point(), we use it to plot simple features.
Ggplot2 code is flexible and compliable. For instance, colour control code inside or outside aes() receive various results. When colours in aes() are in tandem with scale_color_manual() outside aes(), the colour will be linked to corresponding legends. However, ggplot() have several complete themes similar to GIS symbology in layer properties, which simplified visualization process.
##ggplot data without classification. This plot takes almost 20 minutes while GIS takes few seconds!
ggplot()+geom_sf( aes(geometry=geometry,colour = Crime.type),alpha=0.3,show.legend = "point",data = crimedata3SF26)+scale_colour_manual(values = rainbow(14))+theme_minimal()+ labs(x = "Longitude", y = "Latitude", title = "Map of points") ###colour=...&scale_colour_manual could classify the colour on crime type
##ggplot data in a single class--"Bicycle theft"
dec.bike <- crimedata3SF26[crimedata3SF26$Crime.type == "Bicycle theft", ]
ggplot()+geom_sf( mapping=aes(geometry=geometry,colour ="Bicycle theft"),show.legend = "point",data = dec.bike)+theme_minimal()+ labs(x = "Longitude", y = "Latitude", title = "Map of points")
##plot
ggplot()+geom_sf(mapping = aes(geometry=geometry),data = basemap3SF26)+geom_sf( mapping=aes(geometry=geometry,colour ="Bicycle theft"),show.legend = "point",data = dec.bike)+theme_minimal()+ labs(x = "Longitude", y = "Latitude", title = "Borough Bicycle Theft Map")
The get_map() function can access base map data from various providers. Here we use Stamen Maps without API.
Apart from above, to plot point map in scalable range, sf and mapview function could bring us to the world of GIS in R.
Sys.setlocale(category = "LC_ALL", locale = "uk") ### bug of laptop system made in china
[1] "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
tmap_mode("view")
tmap mode set to interactive viewing
tm_shape(BoroughMap27700) +
tm_polygons(col = NA, alpha = 0.5) +
tm_shape(crimebike) +
tm_dots(col = "blue")
The point density could be estimated by stat_density function in ggplot2. Similarly, it uses the principle of 2d Kernel density (Rdocumentation.org, 2018), but cannot set reasonable interval or classification manually.
#########Point density##########
library(viridis)
dec.bikeDF <- data.frame(dec.bike) ###change into a data frame
ggmap(map) +
stat_density2d(mapping = aes(x=Longitude2,y=Latitude2,fill=stat(nlevel), alpha=..level..),bins=250, size=50, alpha = .06,geom = "polygon", data = dec.bikeDF) + scale_fill_gradient(low = "orange", high = "red")#To better visualize the density within each facet, use stat(nlevel)
library(ggmap)
ggmap(map) +
geom_sf(mapping = aes(geometry=geometry),data = basemap3SF26,inherit.aes = FALSE,alpha=0.2)+labs(x = "Longitude", y = "Latitude")+
stat_density2d(mapping = aes(x=Longitude2,y=Latitude2,fill = stat(nlevel),alpha=..level..),bins=150, size=0.1, alpha = .06,geom = "polygon",show.legend = NA, data = dec.bikeDF) + scale_fill_continuous(low = "orange", high = "red")+ scale_alpha(range = c(0, 0.5), guide = FALSE) +
labs(fill='Point Density') +
ggtitle("London Borough Bicycle Theft Heat Map 2018.3")+
theme(plot.title=element_text(color="black",size=18,face="bold"),
legend.title = element_text(colour="black", size=12,face="bold"),
legend.position = "right")+ ###theme() change title,legend,position etc.
ggsave(paste0("density_map.png"), width = 10, height = 8, units = "in")
Coordinate system already present. Adding new coordinate system, which will replace the existing one.
The most distinctive characteristics of GIS and R are respectively real-time visualization and reproducibility. Objects and labels are elements in the form of code with separate geom functions in R. However, these data and files are rendered in layers and symbols which are manipulated by ArcToolBox that contains more analysis methods.
When dealing with large amounts of raw data or calculation, R shows obvious advantages. Additionally, the files could be overwritten in R rather than GIS when repeating operations. In addition, R is developed by packages in infinite functions, which are comparable with the plugins in QGIS instead of ArcGIS. In a word, we should combine advantages of both to tackle comprehensive issues.
Pro.arcgis.com. (2018). Differences between point, line, and kernel density—Help | ArcGIS Desktop. [online] Available at: http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/differences-between-point-line-and-kernel-density.htm [Accessed 25 Nov. 2018]. Rdocumentation.org. (2018). stat_density function | R Documentation. [online] Available at: https://www.rdocumentation.org/packages/ggplot2/versions/1.0.1/topics/stat_density [Accessed 25 Nov. 2018].